This lecture is to introduce the elements in R programming language that are relevant to decision model and decision analysis.
vector, matrix, array, list, and data.frame.x1 <- c(3, 1, 4)
print(x1)
## [1] 3 1 4
x2 <- c(2, 7, 1)
print(x1 + x2)
## [1] 5 8 5
future_human <- c("earther", "martian", "belter")
print(future_human)
## [1] "earther" "martian" "belter"
You could combine different type of vectors
x1_and_future_human <- c(x1, future_human)
print(x1_and_future_human)
## [1] "3" "1" "4" "earther" "martian" "belter"
We often create the initial status in the Markov model using vector
v_init <- c(0.9, 0.09, 0.01)
names(v_init) <- c("healthy", "sick", "dead")
print(v_init)
## healthy sick dead
## 0.90 0.09 0.01
print(sum(v_init))
## [1] 1
print(v_init[2])
## sick
## 0.09
print(v_init["sick"])
## sick
## 0.09
print(v_init[2:3])
## sick dead
## 0.09 0.01
print(v_init[c("sick", "dead")])
## sick dead
## 0.09 0.01
x <- matrix(c(1, 0, 0,
0.8, 0.2, 0,
0, 0, 1),
byrow = T, nrow = 3)
print(x)
## [,1] [,2] [,3]
## [1,] 1.0 0.0 0
## [2,] 0.8 0.2 0
## [3,] 0.0 0.0 1
rownames(x) <- colnames(x) <- c("healthy", "sick", "dead")
print(x)
## healthy sick dead
## healthy 1.0 0.0 0
## sick 0.8 0.2 0
## dead 0.0 0.0 1
print(x[2, 2])
## [1] 0.2
print(x["sick", "sick"])
## [1] 0.2
Matrix multiplication is often used in Markov models. For example, we can multiply the initial state with the transition matrix
v_init is not a matrix. Thus, we use t() to convert v_init into a \(1 \times 3\) matrix.%*% is the symbol for matrix multiplication in Rprint(t(v_init) %*% x)
## healthy sick dead
## [1,] 0.972 0.018 0.01
array() is more useful.array() can be seen as a data type that stores multiple matrices all at once.tr1 <- x
tr2 <- matrix(c(0.9, 0.1, 0,
0.7, 0.2, 0.1,
0, 0, 1),
byrow = T, nrow = 3,
dimnames = list(c("healthy", "sick", "dead"),
c("healthy", "sick", "dead")))
tr_array <- array(dim = c(3, 3, 2),
data = cbind(tr1, tr2),
dimnames = list(c("healthy", "sick", "dead"),
c("healthy", "sick", "dead"),
c(1, 2)))
print(tr_array)
## , , 1
##
## healthy sick dead
## healthy 1.0 0.0 0
## sick 0.8 0.2 0
## dead 0.0 0.0 1
##
## , , 2
##
## healthy sick dead
## healthy 0.9 0.1 0.0
## sick 0.7 0.2 0.1
## dead 0.0 0.0 1.0
print(tr_array[ , , 1])
## healthy sick dead
## healthy 1.0 0.0 0
## sick 0.8 0.2 0
## dead 0.0 0.0 1
print(tr_array[ , , 2])
## healthy sick dead
## healthy 0.9 0.1 0.0
## sick 0.7 0.2 0.1
## dead 0.0 0.0 1.0
print(tr_array[1, 2, 1])
## [1] 0
v_time2 <- t(v_init) %*% tr_array[ , , 1]
print(v_time2)
## healthy sick dead
## [1,] 0.972 0.018 0.01
v_time3 <- v_time2 %*% tr_array[ , , 2]
print(v_time3)
## healthy sick dead
## [1,] 0.8874 0.1008 0.0118
temp_ls <- list(future_human = future_human, v_init = v_init, tr_array = tr_array)
print(temp_ls)
## $future_human
## [1] "earther" "martian" "belter"
##
## $v_init
## healthy sick dead
## 0.90 0.09 0.01
##
## $tr_array
## , , 1
##
## healthy sick dead
## healthy 1.0 0.0 0
## sick 0.8 0.2 0
## dead 0.0 0.0 1
##
## , , 2
##
## healthy sick dead
## healthy 0.9 0.1 0.0
## sick 0.7 0.2 0.1
## dead 0.0 0.0 1.0
print(temp_ls[[1]])
## [1] "earther" "martian" "belter"
print(temp_ls$v_init)
## healthy sick dead
## 0.90 0.09 0.01
print(temp_ls$tr_array[, , 1])
## healthy sick dead
## healthy 1.0 0.0 0
## sick 0.8 0.2 0
## dead 0.0 0.0 1
data.frameR, you probably have encountered data.frame() very frequently.data.frame()profile <- data.frame(name = c("Amos", "Bobbie", "Naomi"),
human_type = c("earther", "martian", "belter"),
height = c(1.8, 2.1, 1.78))
print(profile)
## name human_type height
## 1 Amos earther 1.80
## 2 Bobbie martian 2.10
## 3 Naomi belter 1.78
data.frame() is essentially a named list of vectors with the same length.typeof(profile)
## [1] "list"
length(profile$name)
## [1] 3
length(profile$human_type)
## [1] 3
length(profile$height)
## [1] 3
data.frame()profile[profile$name == "Bobbie", ]
## name human_type height
## 2 Bobbie martian 2.1
profile[profile$name == "Bobbie", "height"]
## [1] 2.1
setwd("path-to-your-project-folder")
setwd("zoeslaptop/Documents/CEA/introR/")
setwd() function sets up your working directory and allows you to access the files and folders in the working directory much easier.For example, I want to run the R script hello.R in the Rscript/ folder in your working directory.
source("zoeslaptop/Documents/CEA/introR/Rscript/hello.R")source("Rscript/hello.R")The format of path could be slightly different between windows, unix, and linux
RStudio provides a very nice feature project. A project sets up the working environment automatically. Whenever you open a project, you are in the working directory to a specific project alreadly!
You could find more information about creating an R project and other useful tip here.
R data format family:
.RData, .rda, and .rds.load() and readRDS()You can save the data using save() and saveRDS().
It is common that you might encounter other data types (e.g., .csv and .txt) or even data format for other software (e.g., Stata and SAS).
To read or save other types of data, you could find the information here and here.
Components included in loops
The most used loop is for loop in many decision models. The structure follows:
for (i in c(start : end)) {
# Routine / Process
}
Example 1: Fibonacci sequence is the sum of the two preceding numbers. We start the sequence from 0 and 1. What are the first 10 Fibonacci numbers? (0 and 1 are the 1st and 2nd Fibonacci number, respectively.)
Note that we want to get all the 10 Fibonacci numbers. We need to create a vector to store all the 10 numbers.
fib_vec.fib_vec <- rep(0, 10)
fib_vec[c(1:2)] <- c(0, 1)
print(fib_vec)
## [1] 0 1 0 0 0 0 0 0 0 0
for (i in c(3 : 10)) {
fib_vec[i] <- fib_vec[i - 1] + fib_vec[i - 2]
}
print(fib_vec)
## [1] 0 1 1 2 3 5 8 13 21 34
Example 2: In year 2400, there are 3000 martians in Mars colony. The growth rate of the martian population follows this formula \(0.05 P(t) \bigl(1 - \frac{P(t)}{10000})\). What is the population size in year 2500?
# initialize a vector of population size over the next 100 years
popsize <- rep(3000, 100)
# Calculate
for (t in c(1 : 100)) {
popsize[t + 1] <- popsize[t] + 0.05 * popsize[t] * (1 - popsize[t] / 100000)
}
popdf <- data.frame(year = c(2400:2500),
popsize = popsize)
print(popdf[popdf$year == 2500, ])
## year popsize
## 101 2500 81499.86
if (condition1) {
# Execute some code
} else if (condition2) {
# Execute some code
} else {
# Execute some code
}
else if and else are not always needed.if()"yoda" == "windu"
## [1] FALSE
Jedi <- c("yoda", "windu", "kenobi")
"yoda" %in% Jedi
## [1] TRUE
if() is true.Mandalorian <- c("satine", "sabine", "jango")
x <- "yoda"
if (x %in% Jedi) {
print("May the force be with you!")
} else if (x %in% Mandalorian) {
print("This is the way!")
} else {
print("Hello, world!")
}
## [1] "May the force be with you!"
"Yoda", you get "Hello, world!"R, e.g., lm(), sum(), print(), etc.The primary components of an R function are the body(), formals(), and environment(). We often want to return the results from the function.
body(): the code inside the function.formals(): the input arguments.environment(): where the variables in the function are located.return(): return the relevant outputspeak <- function(x) {
Jedi <- c("yoda", "windu", "kenobi")
Mandalorian <- c("satine", "sabine", "jango")
membership <- "Not Jedi or Mandalorian"
if (x %in% Jedi) {
say <- "May the force be with you!"
membership <- "Jedi"
} else if (x %in% Mandalorian) {
say <- "This is the way!"
membership <- "Mandalorian"
} else {
say <- "Hello, world!"
}
return(list(say = say, membership = membership))
}
formals(speak)
## $x
body(speak)
## {
## Jedi <- c("yoda", "windu", "kenobi")
## Mandalorian <- c("satine", "sabine", "jango")
## membership <- "Not Jedi or Mandalorian"
## if (x %in% Jedi) {
## say <- "May the force be with you!"
## membership <- "Jedi"
## }
## else if (x %in% Mandalorian) {
## say <- "This is the way!"
## membership <- "Mandalorian"
## }
## else {
## say <- "Hello, world!"
## }
## return(list(say = say, membership = membership))
## }
environment(speak)
## <environment: R_GlobalEnv>
speak("jango")
## $say
## [1] "This is the way!"
##
## $membership
## [1] "Mandalorian"
Question: How would you program the Matian population growth in a function? What are the input arguments? What are the results return from the function?
x <- rnorm(1000, 0, 1) # draw 1000 samples from a normal distribution
print(mean(x))
## [1] 0.01740754
print(sd(x))
## [1] 1.041297
hist(x, freq = F, col = "gray")
library(CEAutil)
data(worldHE)
print(head(worldHE))
## Entity Code Year LEyr HealthExp Pop
## 1 Afghanistan AFG 1800 NA NA 3280000
## 2 Afghanistan AFG 1801 NA NA 3280000
## 3 Afghanistan AFG 1802 NA NA 3280000
## 4 Afghanistan AFG 1803 NA NA 3280000
## 5 Afghanistan AFG 1804 NA NA 3280000
## 6 Afghanistan AFG 1805 NA NA 3280000
worldHE2010 <- worldHE[worldHE$Year == 2010, ]
hist(worldHE2010$LEyr, main = "Life expectancy in 2010")
hist(worldHE2010$HealthExp, main = "Health expenditure in 2010 per capita")
plot(worldHE2010$HealthExp, worldHE2010$LEyr, ylim = c(70, 90),
xlab = "health expenditure", ylab = "life expectancy")
text(worldHE2010$HealthExp, worldHE2010$LEyr, labels = worldHE2010$Entity, cex = 0.8)
worldHE_US <- worldHE[worldHE$Entity == "United States" & worldHE$Year >= 1970 & worldHE$Year <= 2015, ]
# adding lines: UK's expenditure at the same time period
worldHE_UK <- worldHE[worldHE$Entity == "United Kingdom" & worldHE$Year >= 1970 & worldHE$Year <= 2015, ]
plot(worldHE_US$Year, worldHE_US$HealthExp, type = "l", col = "red",
ylab = "expenditure", xlab = "year", ylim = c(500, 10000))
lines(worldHE_UK$Year, worldHE_UK$HealthExp, col = "royalblue")
legend("topleft", legend=c("US", "UK"),
col=c("red", "royalblue"), lty = 1, cex = 0.8)
ggplot2ggplot2, dampack, and CEAutil.if(!require(ggplot2)) install.packages("ggplot2")
if(!require(devtools)) install.packages("devtools")
if(!require(remotes)) install.packages("remotes")
if(!require(dampack)) remotes::install_github("DARTH-git/dampack", dependencies = TRUE)
if(!require(CEAutil)) remotes::install_github("syzoekao/CEAutil", dependencies = TRUE)
For a simple decision tree, we can draw the tree easily. As the decision tree grows, there is software helping you draw the tree such as TreeAge and Amua. In this class, we use Amua because it is free. We will show how to use Amua to build a decision tree and export the tree to R script for CEA analysis.
Follow the instruction here to install Amua. Be aware of the difference between Mac and Windows users! After you install Amua, remember where you store the software.
Amua.jar.temp_Export/..R files: main.R and functions.R.main.R.Dracula is preparing for a spring break party tonight. But there’s one final decision that Dracula is struggling with. Being a vampire, he intends to bite and suck the blood of some of his guests at the party (about 25% of them, by his best estimate). While being bitten by a vampire won’t turn his guests into vampires or zombies, like some horror movies might suggest, there is a 50% chance that a vampire bite results in a rather severe bacterial infection, of which 66% of cases require hospitalization for an average of 1 night.
Being a gracious host, Dracula is considering different ways of administering antibiotic prophylaxis to his guests to reduce their risk of infection. One option is to administer the antibiotics to his victim‐guests just before he bites them – this would reduce the risk of a vampire bite infection by 20%. However, the antibiotics can be even more effective, reducing the risk of infection by 90%, if administered at least 30 minutes before being bitten. To achieve this, Dracula is considering putting the antibiotics into the drinks served at the party to ensure that his guests are all properly dosed before he bites his victims. But this means that all his guests would be exposed to the antibiotics (not just those he intends to bite), and he knows that about 5% of people are severely allergic to these antibiotics and would require immediate hospitalization if exposed. Dracula is therefore also considering not administering antibiotic prophylaxis at all to avoid this harm.
However, all the healthy blood doesn’t come without cost. This party will cost Dracula $1000 for a total of 200 guests (an average of $5 per guest). In addition, Dracula expects the cost of antibiotic to be $10 for each guest he bites. If Dracula decides to administer the antibiotics in all the drinks served at the party, the total cost of antibioitics is expected to be $100 (Dracula gets discount for buying a large batch of antibiotics!). Also, Dracula is willing to pay for the cost of hospitalization for any guest who experience bacteiral infection due to his bites because he feels responsible. The cost of hospitalization per person per night is $500. Dracula expects to get an average of 470 mL healthy blood by biting a guest. However, if the guest ends up hospitalized due to bacterial infection, the healthy blood that Dracula can get is reduced by 10%. Dracula hopes that you can help him determine which strategy he should choose.
Think about the following quesitons:
You can build a decision tree via Amua.
if(!require("remotes")) install.packages("remotes")
if(!require("dplyr")) install.packages("dplyr")
if(!require("CEAutil")) remotes::install_github("syzoekao/CEAutil", dependencies = TRUE)
if(!require(dampack)) remotes::install_github("DARTH-git/dampack", dependencies = TRUE)
library(CEAutil)
parse_amua_tree():This function only takes an input argument, the path to the main.R file of the Amua decision model. The function returns a list of outputs. Output 1 param_ls is a list of input parameters with basecase values used in Amua. Output 2 treefunc is the R code of the Amua decision model formatted as text.
treetxt <- parse_amua_tree("AmuaExample/DraculaParty_Export/main.R")
print(treetxt$param_ls)
## $p_bite
## [1] 0.25
##
## $p_inf_bitten
## [1] 0.5
##
## $p_inf_not_allergic
## [1] 0.4
##
## $p_inf_bitten_UA
## [1] 0.05
##
## $C_hospital
## [1] 500
##
## $C_donothing
## [1] 5
##
## $C_drug
## [1] 10
##
## $C_drug_UA
## [1] 0.5
##
## $p_hospital
## [1] 0.66
##
## $p_not_allergic
## [1] 0.95
##
## $red_blood
## [1] 0.1
##
## $val_blood
## [1] 470
dectree_wrapper():We use the two outputs from the parse_amua_tree() as the input arguments in the dectree_wrapper(). The input arguments in the wrapper function include params_basecase, treefunc, and popsize. params_basecase takes a list of named input parameters. treefunc takes the text file organized by the parse_amua_tree(). popsize is defaulted as 1 but you could change your population size.
param_ls <- treetxt[["param_ls"]]
treefunc <- treetxt[["treefunc"]]
tree_output <- dectree_wrapper(params_basecase = param_ls, treefunc = treefunc, popsize = 1)
print(tree_output)
## strategy expectedBlood expectedHospitalization Cost
## 1 Donothing 113.6225 0.0825000 46.25000
## 2 Targetedantibiotics 108.6781 0.0752000 45.10000
## 3 Universalantibiotics 111.2566 0.0578375 34.41875
tree_output <- dectree_wrapper(params_basecase = param_ls, treefunc = treefunc, popsize = 200)
print(tree_output)
## strategy expectedBlood expectedHospitalization Cost
## 1 Donothing 22724.50 16.5000 9250.00
## 2 Targetedantibiotics 21735.62 15.0400 9020.00
## 3 Universalantibiotics 22251.33 11.5675 6883.75
library(dampack)
dracula_icer <- calculate_icers(tree_output$Cost,
tree_output$expectedBlood,
tree_output$strategy)
print(dracula_icer)
## Strategy Cost Effect Inc_Cost Inc_Effect ICER Status
## 1 Universalantibiotics 6883.75 22251.33 NA NA NA ND
## 2 Donothing 9250.00 22724.50 2366.25 473.1725 5.000819 ND
## 3 Targetedantibiotics 9020.00 21735.62 NA NA NA D
plot(dracula_icer, effect_units = "mL")
Example 1. Dracula has been starved over the long cruel winter in Minnesota. The Spring break is the first time that his guests are willing to come to his party in several months. Therefore, Dracula is going to seize the chance to bite as many guests as possible. The probability that he bites a guest is now increased to 50%. What are the cost and effectiveness of each strategy?
param_ls$p_bite <- 0.5
tree_output <- dectree_wrapper(params_basecase = param_ls, treefunc = treefunc, popsize = 200)
print(tree_output)
## strategy expectedBlood expectedHospitalization Cost
## 1 Donothing 45449.00 33.000 17500.0
## 2 Targetedantibiotics 43471.24 30.080 17040.0
## 3 Universalantibiotics 44502.65 13.135 7667.5
dracula_icer <- calculate_icers(tree_output$Cost,
tree_output$expectedBlood,
tree_output$strategy)
print(dracula_icer)
## Strategy Cost Effect Inc_Cost Inc_Effect ICER Status
## 1 Universalantibiotics 7667.5 44502.65 NA NA NA ND
## 2 Donothing 17500.0 45449.00 9832.5 946.345 10.38997 ND
## 3 Targetedantibiotics 17040.0 43471.24 NA NA NA D
plot(dracula_icer, effect_units = "mL")
Example 2. The cost of hospitalization due to bacterial infection varies from guest to guest depending on the healthcare that a guest has. Overall, the cost of hospitalization has a mean of $500 with a standard deviation $300 following a gamma distribution. What are the mean cost and effectiveness of the party across all 200 guests?
require(dampack)
C_hospital <- gen_psa_samp(params = c("C_hospital"),
dists = c("gamma"),
parameterization_types = c("mean, sd"),
dists_params = list(c(500, 250)),
nsamp = 200)
cat("mean and sd are", c(mean(C_hospital$C_hospital), sd(C_hospital$C_hospital)), ", respectively.")
## mean and sd are 512.8231 224.5915 , respectively.
hist(C_hospital$C_hospital, main = "histogram", xlab = "values", ylab = "frequency", col = "gray", border = F)
abline(v = mean(C_hospital$C_hospital), col = "firebrick", lwd = 3)
text(mean(C_hospital$C_hospital) + 50, 30, "mean\ncost", col = "firebrick", font = 2)
tree_vary <- dectree_wrapper(params_basecase = param_ls,
treefunc = treefunc,
vary_param_samp = C_hospital)
## Warning in dectree_wrapper(params_basecase = param_ls, treefunc = treefunc, :
## nsamp are not the parameters included in params_basecase. The function will
## ingore these parameters.
print(names(tree_vary))
## [1] "expectedBlood" "expectedHospitalization"
## [3] "Cost" "param_samp"
print(head(tree_vary$param_samp))
## nsamp C_hospital
## 1 1 546.5736
## 2 2 542.3999
## 3 3 404.3739
## 4 4 384.7145
## 5 5 332.0726
## 6 6 825.7011
print(head(tree_vary$expectedBlood))
## Donothing Targetedantibiotics Universalantibiotics
## 1 227.245 217.3562 222.5133
## 2 227.245 217.3562 222.5133
## 3 227.245 217.3562 222.5133
## 4 227.245 217.3562 222.5133
## 5 227.245 217.3562 222.5133
## 6 227.245 217.3562 222.5133
print(head(tree_vary$Cost))
## Donothing Targetedantibiotics Universalantibiotics
## 1 95.18464 92.20467 41.39622
## 2 94.49598 91.57694 41.12211
## 3 71.72170 70.81784 32.05726
## 4 68.47789 67.86106 30.76613
## 5 59.79198 59.94372 27.30887
## 6 141.24068 134.18544 59.72792
print(summary(tree_vary$Cost))
## Donothing Targetedantibiotics Universalantibiotics
## Min. : 11.94 Min. : 16.33 Min. : 8.264
## 1st Qu.: 60.27 1st Qu.: 60.38 1st Qu.:27.501
## Median : 87.05 Median : 84.79 Median :38.158
## Mean : 89.62 Mean : 87.13 Mean :39.180
## 3rd Qu.:114.17 3rd Qu.:109.51 3rd Qu.:48.951
## Max. :210.52 Max. :197.33 Max. :87.302
hist(tree_vary$Cost$Donothing, col = rgb(0.5, 0.5, 0.5, 0.6), border = F,
main = c("Cost of hospitalization"),
xlab = "Cost")
hist(tree_vary$Cost$Targetedantibiotics, col = rgb(0.2, 0.2, 0.8, 0.5), border = F, add = T)
hist(tree_vary$Cost$Universalantibiotics, col = rgb(0.3, 0.8, 0.2, 0.5), border = F, add = T)
legend("topright", c("do nothing", "targeted", "universal"),
col = c(rgb(0.5, 0.5, 0.5, 0.6), rgb(0.2, 0.2, 0.8, 0.5), rgb(0.3, 0.8, 0.2, 0.5)),
pch = 15, pt.cex = 2)
markov_model <- function(l_param_all,
strategy = NULL) {
#### 1. Read in, define, or transform parameters if needed
#### 2. Create the transition probability matrices using array
#### 3. Create the trace matrix to track the changes in the population distribution
#### through time. You could also create other matrix to track different outcomes,
#### e.g., costs, incidence, etc.
#### 4. Get outputs
#### 5. Return the relevant results
}
markov_model()The Canadian province of Ontario is considering a province-wide ban on indoor tanning as a means of preventing skin cancer, with a focus on young women. The Ontario Ministry of Health has just finished a large observational study on tanning behaviors and skin cancer incidence among women in Ontario to inform their decision.
In their surveillance study, they found that skin cancer risks differ substantially for individuals who visit tanning salons regularly (“regular tanners”) and those who do not (“non-tanners”). The annual risk of skin cancer was 4% for regular tanners vs. 0.5% for non-tanners. (Assume that skin cancer risk depends only on current tanning behavior and not on tanning history).
The Ministry of Health also studied tanning behaviors. They estimated the annual probability of a non-tanner becoming a regular tanner, by age, among women. This probability begins increasing around age 12, peaks at age 24 and then begins to decrease. They also estimated the rate of a regular tanner becoming a non-tanner. This probability is low until age 30, after which it increases with age. These data are summarized in the dataset ONtan in the CEAutil package.
Skin cancer resolves within one year of diagnosis, with 7% of cases resulting in death. Those who recover following a skin cancer diagnosis nearly all quit tanning, though they re-initiate indoor tanning at the same rate as their peers who have not experienced skin cancer.
The Ontario Ministry of Health is considering an indoor tanning ban for those 18 years of age and younger (reducing the rate of tanning initiation to zero among this age group). They are also considering a full indoor tanning ban, which would reduce the rate of tanning initiation to zero for everyone in Ontario. They would like to evaluate the impact that each of these policies would have over the lifetime of a cohort of 10-year-old girls. The Ministry of Health does not wish to discount outcomes.
Draw a Markov model diagram of tanning behavior and skin cancer that the Ontario Ministry of Health could use to evaluate their tanning policies in terms of the desired outcomes. Use a yearly time step. Include the following states: “Non-tanners”, “Regular tanners”, “Skin cancer”, “Dead from skin cancer”, and “Dead from other causes”. Label all transition probabilities and indicate which are changing over time.
Calcuate the remaining life-expectancy of a 10-year-old girl under each strategy.
Before doing any actual coding:
Here is the Markov model diagram:
| parameter code | definition | data type |
|---|---|---|
| p_init_tan | probability of initiating tanning | table |
| p_halt_tan | probability of discontinuing tanning | table |
| p_nontan_to_cancer | probability of having skin cancer among non-tanners | constant |
| p_regtan_to_cancer | probability of having skin cancer among regular tanners | constant |
| p_cancer_to_dead | probability of cancer-specific death | constant |
| p_mort | probability of natural death | table |
CEAutil package.library(CEAutil)
data(ONtan)
ltable <- ONtan$lifetable
behavior <- ONtan$behavior
print(head(ltable))
## age qx
## 1 0 0.00468
## 2 1 0.00017
## 3 2 0.00014
## 4 3 0.00012
## 5 4 0.00011
## 6 5 0.00009
print(head(behavior))
## age p_init_tanning p_halt_tanning
## 1 10 0.05 0.01
## 2 11 0.05 0.01
## 3 12 0.10 0.01
## 4 13 0.10 0.01
## 5 14 0.11 0.01
## 6 15 0.12 0.01
p_nontan_to_cancer <- 0.005
p_regtan_to_cancer <- 0.04
p_cancer_to_dead <- 0.07
In addition, we have other required variables
n_t)state_names)v_init)n_t <- 100
state_names <- c("nontan", "regtan", "cancer", "deadnature", "deadcancer")
v_init <- c(1, 0, 0, 0, 0)
markov_model() function. In this model framework, we use list() to combine all the parameters, data, and variables defined beforehand. We provide the name for each the element in the list.l_param_all <- list(p_nontan_to_cancer = 0.005,
p_regtan_to_cancer = 0.04,
p_cancer_to_dead = 0.07,
ltable = ltable,
behavior = behavior,
state_names = c("nontan", "regtan", "cancer", "deadnature", "deadcancer"),
n_t = 100,
v_init = c(1, 0, 0, 0, 0))
#### 1. Read in, set, or transform parameters if needed
# We need the age index for matching values of the lifetable and behavior data.
ages <- c(10 : (10 + n_t - 1))
# We extract the probability of natural death age 10-110 from the lifetable.
p_mort <- ltable$qx[match(ages, ltable$age)]
# We modify the values of tanning behavior based on strategy of interest.
if (strategy == "targeted ban") {
behavior$p_init_tanning[behavior$age <= 18] <- 0
}
if (strategy == "universal ban") {
behavior$p_init_tanning <- 0
}
# We extract the tanning behavior at age 10-110 from the behavior data
p_init_tan <- behavior$p_init_tanning[match(ages, behavior$age)]
p_halt_tan <- behavior$p_halt_tanning[match(ages, behavior$age)]
# We get the number of health states based on the length of the string vector, state_names
n_states <- length(state_names)
## state1 state2 state3
## state1 0.1 0.2 0.7
## state2 0.5 0.1 0.4
## state3 0.7 0.0 0.3
print(rowSums(transit_matrix))
## state1 state2 state3
## 1 1 1
tr_mat <- array(0, dim = c(n_states, n_states, n_t),
dimnames = list(state_names, state_names, ages))
# 1. Fill out the transition probabilities from non-tanner to other states
tr_mat["nontan", "regtan", ] <- p_init_tan
tr_mat["nontan", "cancer", ] <- p_nontan_to_cancer
tr_mat["nontan", "deadnature", ] <- p_mort
tr_mat["nontan", "nontan", ] <- 1 - p_init_tan - p_nontan_to_cancer - p_mort
# 2. Fill out the transition probabilities from regular tanner to other states
tr_mat["regtan", "nontan", ] <- p_halt_tan
tr_mat["regtan", "cancer", ] <- p_regtan_to_cancer
tr_mat["regtan", "deadnature", ] <- p_mort
tr_mat["regtan", "regtan", ] <- 1 - p_halt_tan - p_regtan_to_cancer - p_mort
# 3. Fill out the transition probabilities from skin cancer to other states.
# Be careful that this is a tunnel state. Therefore, there is no self loop.
tr_mat["cancer", "deadcancer", ] <- p_cancer_to_dead
tr_mat["cancer", "deadnature", ] <- p_mort
tr_mat["cancer", "nontan", ] <- 1 - p_cancer_to_dead - p_mort
# 4. Fill out the transition probabilities for cancer specific death (this is an absorbing state!!)
tr_mat["deadcancer", "deadcancer", ] <- 1
# 5. Fill out the transition probabilities for natural death (this is an absorbing state!!)
tr_mat["deadnature", "deadnature", ] <- 1
print(tr_mat[, , "20"])
## nontan regtan cancer deadnature deadcancer
## nontan 0.82477 0.17000 0.005 0.00023 0.00
## regtan 0.01000 0.94977 0.040 0.00023 0.00
## cancer 0.92977 0.00000 0.000 0.00023 0.07
## deadnature 0.00000 0.00000 0.000 1.00000 0.00
## deadcancer 0.00000 0.00000 0.000 0.00000 1.00
print(tr_mat[, , "50"])
## nontan regtan cancer deadnature deadcancer
## nontan 0.98303 0.01000 0.005 0.00197 0.00
## regtan 0.50000 0.45803 0.040 0.00197 0.00
## cancer 0.92803 0.00000 0.000 0.00197 0.07
## deadnature 0.00000 0.00000 0.000 1.00000 0.00
## deadcancer 0.00000 0.00000 0.000 0.00000 1.00
# Check whether the transition matrices have any negative values or values > 1!!!
if (sum(tr_mat > 1) | sum(tr_mat < 0)) stop("there are invalid transition probabilities")
# Check whether each row of a transition matrix sum up to 1!!!
if (any(rowSums(t(apply(tr_mat, 3, rowSums))) != n_states)) stop("transition probabilities do not sum up to one")
v_init)#### 3. Create the trace matrix to track the changes in the population distribution through time
#### You could also create other matrix to track different outcomes,
#### e.g., costs, incidence, etc.
trace_mat <- matrix(0, ncol = n_states, nrow = n_t + 1,
dimnames = list(c(10 : (10 + n_t)), state_names))
# Modify the first row of the trace_mat using the v_init
trace_mat[1, ] <- v_init
# Suppose that we want to track the cost of having skin cancer for a year
trace_cost <- rep(0, n_t)
for() loop to iterate the calculation through time.#### 4. Compute the Markov model over time by iterating through time steps
for(t in 1 : n_t){
trace_mat[t + 1, ] <- trace_mat[t, ] %*% tr_mat[, , t]
}
The outcome in this example is “expected life years” (LE). Therefore, we remove the last two columns.
data.frame with the strategy in the first column and LE as the second column.You could add more columns to the data.frame if you have more outcomes of interest (e.g., cost)
#### 5. Organize outputs
LE <- sum(rowSums(trace_mat[, -c(4, 5)])) - 1
output <- data.frame("strategy" = "null",
"LE" = LE)
return(output)
return(list(output = output, trace = trace_mat))
CEAutil package.Need discussion!! Should we illustrate how to calculate cost?
library(CEAutil)
print(markov_model)
## function(l_param_all,
## strategy = NULL) {
## with(as.list(l_param_all), {
## #### Step 1. Read in, set, or transform parameters if needed
## ages <- c(10 : (10 + n_t - 1))
## p_mort <- ltable$qx[match(ages, ltable$age)]
##
## if (strategy == "targeted ban") {
## behavior$p_init_tanning[behavior$age <= 18] <- 0
## }
## if (strategy == "universal ban") {
## behavior$p_init_tanning <- 0
## }
##
## p_init_tan <- behavior$p_init_tanning[match(ages, behavior$age)]
## p_halt_tan <- behavior$p_halt_tanning[match(ages, behavior$age)]
##
## n_states <- length(state_names)
##
## #### Step 2. Create the transition probability matrices using array
## tr_mat <- array(0, dim = c(n_states, n_states, n_t),
## dimnames = list(state_names, state_names, ages))
##
## ## Start filling out transition probabilities in the array
## ## When you fill out the transition probabilities, always remember to deal with
## ## the self-loop the LAST!!!!!
## # 1. Fill out the transition probabilities from non-tanner to other states
## tr_mat["nontan", "regtan", ] <- p_init_tan
## tr_mat["nontan", "cancer", ] <- p_nontan_to_cancer
## tr_mat["nontan", "deadnature", ] <- p_mort
## tr_mat["nontan", "nontan", ] <- 1 - p_init_tan - p_nontan_to_cancer - p_mort
##
## # 2. Fill out the transition probabilities from regular tanner to other states
## tr_mat["regtan", "nontan", ] <- p_halt_tan
## tr_mat["regtan", "cancer", ] <- p_regtan_to_cancer
## tr_mat["regtan", "deadnature", ] <- p_mort
## tr_mat["regtan", "regtan", ] <- 1 - p_halt_tan - p_regtan_to_cancer - p_mort
##
## # 3. Fill out the transition probabilities from skin cancer to other states.
## # Be careful that this is a tunnel state. Therefore, there is no self loop.
## tr_mat["cancer", "deadcancer", ] <- p_cancer_to_dead
## tr_mat["cancer", "deadnature", ] <- p_mort
## tr_mat["cancer", "nontan", ] <- 1 - p_cancer_to_dead - p_mort
##
## # 4. Fill out the transition probabilities for cancer specific death (this is an absorbing state!!)
## tr_mat["deadcancer", "deadcancer", ] <- 1
##
## # 5. Fill out the transition probabilities for natural death (this is an absorbing state!!)
## tr_mat["deadnature", "deadnature", ] <- 1
##
## # Check whether the transition matrices have any negative values or values > 1!!!
## if (sum(tr_mat > 1) | sum(tr_mat < 0)) stop("there are invalid transition probabilities")
##
## # Check whether each row of a transition matrix sum up to 1!!!
## if (any(rowSums(t(apply(tr_mat, 3, rowSums))) != n_states)) stop("transition probabilities do not sum up to one")
##
## #### Step 3. Create the trace matrix to track the changes in the population distribution through time
## #### You could also create other matrix to track different outcomes,
## #### e.g., costs, incidence, etc.
## trace_mat <- matrix(0, ncol = n_states, nrow = n_t + 1,
## dimnames = list(c(10 : (10 + n_t)), state_names))
## # Modify the first row of the trace_mat using the v_init
## trace_mat[1, ] <- v_init
##
## # Suppose that we want to track the cost of having skin cancer for a year
## trace_cost <- rep(0, n_t)
##
## #### Step 4. Compute the Markov model over time by iterating through time steps
## for(t in 1 : n_t){
## trace_mat[t + 1, ] <- trace_mat[t, ] %*% tr_mat[, , t]
## }
##
## #### Step 5. Organize outputs
## LE <- sum(rowSums(trace_mat[, !grepl("dead", state_names)])) - 1
## output <- data.frame("strategy" = strategy,
## "LE" = LE)
##
## #### Step 6. Return the relevant results
## return(output)
## })
## }
## <bytecode: 0x7fded54d3cb0>
## <environment: namespace:CEAutil>
markov_model() function has another argument strategy.print(markov_model(l_param_all = l_param_all, strategy = "null"))
## strategy LE
## 1 null 70.55568
print(markov_model(l_param_all = l_param_all, strategy = "targeted ban"))
## strategy LE
## 1 targeted ban 71.24541
print(markov_model(l_param_all = l_param_all, strategy = "universal ban"))
## strategy LE
## 1 universal ban 72.44861
markov_decision_wrapper() is to calculate the results of all strategies all at once.
vary_param_ls: the parameters that could change in other more advance analysesother_input_ls: the paramters, datasets, and variables that do not change from simulation to simulationuserfun: the Markov model function. You could also create your own Markov model functionstrategy_set: The vector of the strategy names.vary_param_ls <- list(p_nontan_to_cancer = 0.005,
p_regtan_to_cancer = 0.04,
p_cancer_to_dead = 0.07)
other_input_ls <- list(ltable = ltable,
behavior = behavior,
state_names = c("nontan", "regtan", "cancer", "deadnature", "deadcancer"),
n_t = 100,
v_init = c(1, 0, 0, 0, 0))
res <- markov_decision_wrapper(vary_param_ls = vary_param_ls,
other_input_ls = other_input_ls,
userfun = markov_model,
strategy_set = c("null", "targeted ban", "universal ban"))
print(res)
## strategy LE
## 1 null 70.55568
## 2 targeted ban 71.24541
## 3 universal ban 72.44861
l_param_all easily.vary_param_ls <- list(p_nontan_to_cancer = 0.005,
p_regtan_to_cancer = 0.08,
p_cancer_to_dead = 0.1)
res <- markov_decision_wrapper(vary_param_ls = vary_param_ls,
other_input_ls = other_input_ls,
userfun = markov_model,
strategy_set = c("null", "targeted ban", "universal ban"))
print(res)
## strategy LE
## 1 null 67.24277
## 2 targeted ban 68.95024
## 3 universal ban 72.04560
dampackdampackdampack is the decision analysis modeling package here.library(CEAutil)
library(dampack)
p_bite), the cost of the antibiotics (C_drug), and the probability of infection after he bites a guest (p_inf_bitten), respectivelyrun_owsa_det() is for the one-way sensitivity analysis. Let’s take a look into the run_owsa_det() function:run_owsa_det(params_range, params_basecase, nsamp = 100, FUN,
outcomes = NULL, strategies = NULL, ...)
params_range: a data.frame with 3 columns in the following order: “pars”, “min”, and “max”.params_range <- data.frame(pars = c("p_bite", "C_drug", "p_inf_bitten"),
min = c(0.05, 2, 0.2),
max = c(0.8, 20, 0.9))
print(params_range)
## pars min max
## 1 p_bite 0.05 0.8
## 2 C_drug 2.00 20.0
## 3 p_inf_bitten 0.20 0.9
params_basecase: a named list of basecase values for input parameters needed by the user-defined decision function FUN.treetxt <- parse_amua_tree("AmuaExample/DraculaParty_Export/main.R")
treefunc <- treetxt[["treefunc"]] # This is the tree function text from Amua
param_ls <- treetxt[["param_ls"]] # This is a list of parameters with basecase values
print(param_ls)
## $p_bite
## [1] 0.25
##
## $p_inf_bitten
## [1] 0.5
##
## $p_inf_not_allergic
## [1] 0.4
##
## $p_inf_bitten_UA
## [1] 0.05
##
## $C_hospital
## [1] 500
##
## $C_donothing
## [1] 5
##
## $C_drug
## [1] 10
##
## $C_drug_UA
## [1] 0.5
##
## $p_hospital
## [1] 0.66
##
## $p_not_allergic
## [1] 0.95
##
## $red_blood
## [1] 0.1
##
## $val_blood
## [1] 470
nsamp: number of samples. Default is 100.
FUN: user-defined function that takes the basecase in params_basecase and ... to produce the outcome(s) of interest. The FUN must return a dataframe where the first column are the strategy names and the rest of the columns must be outcomes. The function used in our tree example is dectree_wrapper().
outcomes: The outcomes of interest. If you use the default setting NULL, run_owsa_det will run the one-way sensitivity analysis for every outcome. In our case, the outcomes include expectedEff1, expectedEff2, and Cost. You could also specify a subset of outcomes:
outcomes = c("expectedEff1", "Cost")
strategies: You can leave this as default or give your favorite names to the strategies.
...: Additional arguments for the user-defined function FUN. In our case, dectree_wrapper() requires params_basecase, treefunc, and popsize. We have already set up the params_basecase. Thus, for ..., we need to add treefunc and popsize.
owsa_out <- run_owsa_det(params_range, param_ls, nsamp = 100,
dectree_wrapper,
treefunc = treefunc, popsize = 200)
print(str(owsa_out))
## List of 3
## $ owsa_expectedBlood :Classes 'owsa' and 'data.frame': 900 obs. of 4 variables:
## ..$ parameter : Factor w/ 3 levels "p_bite","C_drug",..: 1 1 1 1 1 1 1 1 1 1 ...
## ..$ strategy : Factor w/ 3 levels "st_Donothing",..: 1 1 1 1 1 1 1 1 1 1 ...
## ..$ param_val : num [1:900] 0.05 0.0576 0.0652 0.0727 0.0803 ...
## ..$ outcome_val: num [1:900] 4545 5234 5922 6611 7299 ...
## $ owsa_expectedHospitalization:Classes 'owsa' and 'data.frame': 900 obs. of 4 variables:
## ..$ parameter : Factor w/ 3 levels "p_bite","C_drug",..: 1 1 1 1 1 1 1 1 1 1 ...
## ..$ strategy : Factor w/ 3 levels "st_Donothing",..: 1 1 1 1 1 1 1 1 1 1 ...
## ..$ param_val : num [1:900] 0.05 0.0576 0.0652 0.0727 0.0803 ...
## ..$ outcome_val: num [1:900] 3.3 3.8 4.3 4.8 5.3 5.8 6.3 6.8 7.3 7.8 ...
## $ owsa_Cost :Classes 'owsa' and 'data.frame': 900 obs. of 4 variables:
## ..$ parameter : Factor w/ 3 levels "p_bite","C_drug",..: 1 1 1 1 1 1 1 1 1 1 ...
## ..$ strategy : Factor w/ 3 levels "st_Donothing",..: 1 1 1 1 1 1 1 1 1 1 ...
## ..$ param_val : num [1:900] 0.05 0.0576 0.0652 0.0727 0.0803 ...
## ..$ outcome_val: num [1:900] 2650 2900 3150 3400 3650 3900 4150 4400 4650 4900 ...
## NULL
print(head(owsa_out$owsa_Cost))
## parameter strategy param_val outcome_val
## 1 p_bite st_Donothing 0.05000000 2650
## 2 p_bite st_Donothing 0.05757576 2900
## 3 p_bite st_Donothing 0.06515152 3150
## 4 p_bite st_Donothing 0.07272727 3400
## 5 p_bite st_Donothing 0.08030303 3650
## 6 p_bite st_Donothing 0.08787879 3900
plot(owsa_out$owsa_expectedBlood)
plot(owsa_out$owsa_Cost)
p_bite and C_drug both changes.run_twsa_det() allows us to investigate two-way sensitivity analysis for any pair of parameters.run_twsa_det(params_range, params_basecase, nsamp = 40,
FUN, outcomes = NULL, strategies = NULL, ...)
run_owsa_det(). The primary difference is the function can only take two parameters at a time in the params_range.params_range <- data.frame(pars = c("p_bite", "C_drug"),
min = c(0.05, 2),
max = c(0.8, 20))
twsa_out <- run_twsa_det(params_range, param_ls, nsamp = 10,
dectree_wrapper,
treefunc = treefunc, popsize = 200)
plot(twsa_out$twsa_expectedBlood)
plot(twsa_out$twsa_Cost)
Need discussion!!
gen_psa_samp()run_psalibrary(CEAutil)
library(data.table)
##
## Attaching package: 'data.table'
## The following objects are masked from 'package:dplyr':
##
## between, first, last
library(dampack)
data(ONtan)
ltable <- ONtan$lifetable
behavior <- ONtan$behavior
vary_param_ls <- list(p_nontan_to_cancer = 0.005,
p_regtan_to_cancer = 0.04,
p_cancer_to_dead = 0.07)
other_input_ls <- list(ltable = ltable,
behavior = behavior,
state_names = c("nontan", "regtan", "cancer", "deadnature", "deadcancer"),
n_t = 100,
v_init = c(1, 0, 0, 0, 0))
res <- markov_decision_wrapper(vary_param_ls = vary_param_ls,
other_input_ls = other_input_ls,
userfun = markov_model,
strategy_set = c("null", "targeted ban", "universal ban"))
params_range <- data.frame(pars = c("p_regtan_to_cancer", "p_cancer_to_dead"),
min = c(0.01, 0.01),
max = c(0.1, 0.1))
owsa_out <- run_owsa_det(params_range, vary_param_ls, nsamp = 100,
FUN = markov_decision_wrapper,
userfun = markov_model,
other_input_ls = other_input_ls,
strategy_set = c("null", "targeted ban", "universal ban"))
plot(owsa_out)
twsa_out <- run_twsa_det(params_range, vary_param_ls, nsamp = 10,
FUN = markov_decision_wrapper,
userfun = markov_model,
other_input_ls = other_input_ls,
strategy_set = c("null", "targeted ban", "universal ban"))
plot(twsa_out)